1. Introduction.

Let $X_1, X_2, \ldots, X_n$ be a sample of a one-dimensional random variable
X which has the continuous cumulative probability function F. It has
been observed [1] that, to the authors' knowledge, all distribution-free
statistics considered in the past can be written in the form
$\Phi[F(X_1), F(X_2), \ldots, F(X_n)]$, where $\Phi$ is a measurable symmetric function
defined on the unit cube $\{0 \le U_i \le 1,\ i = 1, 2, \ldots, n\}$. It is the purpose
of this paper to study the relationship between the class of statistics
which can be written in this particular form and the class of
distribution-free statistics.


2. Distribution-free statistics and statistics of structure (d).

Let $\Omega$ and $\Omega'$ be two families of cumulative probability functions.
A real quantity

$W = S(X_1, X_2, \ldots, X_n, G)$

will be called a statistic in $\Omega$ with regard to $\Omega'$ if, for any $G \in \Omega$,
$F \in \Omega'$, and $X_1, X_2, \ldots, X_n$ in the n-dimensional sample space for a random
variable X which has the cumulative probability function F,

1° $S(X_1, X_2, \ldots, X_n, G)$ is defined almost everywhere in the sample space
$X_1, X_2, \ldots, X_n$ (i.e. with the possible exception of a set of
probability zero), and

2° $W = S(X_1, X_2, \ldots, X_n, G)$ has a probability distribution; this
probability distribution will be denoted by

$\mathcal{P}(W \mid F) = \mathcal{P}[S(X_1, X_2, \ldots, X_n, G); F]$.

For example, Kolmogorov's statistic

(2.1)  $D_n = \sup_{-\infty < x < \infty} |F_n(x) - G(x)|$,

where $F_n$ is the empirical cumulative distribution function determined by
the sample $X_1, X_2, \ldots, X_n$, satisfies 1° and 2° when $\Omega = \Omega'$ is
the class of all non-degenerate cumulative probability functions 2/; hence
$D_n$ is a statistic in this class with regard to itself.

If for a statistic $S(X_1, X_2, \ldots, X_n, G)$ in $\Omega$ with regard to $\Omega'$
there exists a function $\Phi$ defined on the n-dimensional unit cube and
symmetric in its arguments, such that for any $G \in \Omega$, $F \in \Omega'$ we have

$S(X_1, X_2, \ldots, X_n, G) = \Phi[G(X_1), G(X_2), \ldots, G(X_n)]$

almost everywhere 1/ in the sample space $X_1, X_2, \ldots, X_n$ for the random
variable X which has the cumulative probability function F, then we shall
say that $S(X_1, X_2, \ldots, X_n, G)$ is a statistic of structure (d).

Kolmogorov's statistic (2.1) is an example of a statistic of
structure (d), since it can be written as

$D_n = \max_{1 \le i \le n} \max\{ \frac{i}{n} - G(X^{(i)}),\ G(X^{(i)}) - \frac{i-1}{n} \}$,

where $X^{(1)}, X^{(2)}, \ldots, X^{(n)}$ are the numbers $X_1, X_2, \ldots, X_n$ ordered increasingly.
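The order-statistic form above can be sketched in Python. This is a minimal illustration, not code from the paper: the function names and the exponential choice of G are assumptions made for the example.

```python
import math
from typing import Callable, Sequence


def kolmogorov_from_uniforms(u: Sequence[float]) -> float:
    """Phi(u_1, ..., u_n): Kolmogorov's D_n written in terms of u_i = G(X_i) only."""
    n = len(u)
    ordered = sorted(u)  # sorting makes Phi symmetric in its arguments
    return max(
        max(i / n - ui, ui - (i - 1) / n)
        for i, ui in enumerate(ordered, start=1)
    )


def kolmogorov_statistic(sample: Sequence[float], G: Callable[[float], float]) -> float:
    """S(X_1, ..., X_n, G) = Phi[G(X_1), ..., G(X_n)]."""
    return kolmogorov_from_uniforms([G(x) for x in sample])


# Hypothetical example: G is the exponential cumulative probability function with rate 1.
def G_exp(x: float) -> float:
    return 1.0 - math.exp(-x)


sample = [0.3, 1.7, 0.9]
d_n = kolmogorov_statistic(sample, G_exp)
```

The sample enters only through the values G(X_i), and the sort removes any dependence on their order, which are the two requirements of structure (d).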


2/ The notations for various classes of cumulative probability functions
are those introduced by Scheffé [2].

1/ The exceptional set of probability zero may depend on G.




If $\Omega = \Omega'$ and the statistic $S(X_1, X_2, \ldots, X_n, G)$ has the property
that the probability distribution $\mathcal{P}[S(X_1, X_2, \ldots, X_n, G); G]$ is independent
of G for $G \in \Omega$, we shall say that $S(X_1, X_2, \ldots, X_n, G)$ is a
distribution-free statistic in $\Omega$.

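Let us now assume $\Omega = \Omega' = \Omega_2$, the class of all continuous
cumulative probability functions. Denoting by R the rectangular distribution
in (0,1) we have

$\mathcal{P}\{\Phi[G(X_1), \ldots, G(X_n)]; G\} = \mathcal{P}\{\Phi[U_1, \ldots, U_n]; R\}$,

since each $U_i = G(X_i)$ has the distribution R when X has the continuous
cumulative probability function G. It follows that if a statistic in
$\Omega_2$ with regard to $\Omega_2$ has structure
(d) then it is distribution-free in $\Omega_2$.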
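The argument can be checked numerically on a single point of the unit cube. The sketch below is an illustration under stated assumptions: the two continuous cumulative probability functions (exponential and logistic), their inverses, and the name `phi` are chosen for the example and do not come from the paper.

```python
import math


def phi(u):
    """Symmetric function on the unit cube: Kolmogorov's statistic in terms of u_i."""
    n = len(u)
    return max(
        max(i / n - v, v - (i - 1) / n)
        for i, v in enumerate(sorted(u), start=1)
    )


# Two assumed continuous, strictly increasing cdfs and their inverses.
def G_exp(x):
    return 1.0 - math.exp(-x)


def G_exp_inv(u):
    return -math.log(1.0 - u)


def G_logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def G_logistic_inv(u):
    return math.log(u / (1.0 - u))


u = [0.12, 0.55, 0.61, 0.93]             # a fixed point of the unit cube
x_exp = [G_exp_inv(v) for v in u]        # the sample this point gives under G_exp
x_log = [G_logistic_inv(v) for v in u]   # the sample this point gives under G_logistic

s_unif = phi(u)
s_exp = phi([G_exp(x) for x in x_exp])
s_log = phi([G_logistic(x) for x in x_log])
```

All three values agree up to rounding, because G(X_i) recovers the same uniform coordinates whichever continuous G generated the sample.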

All distribution-free statistics considered in the literature happen to
have structure (d), with $\Omega = \Omega' = \Omega_2$. Nevertheless the conjecture
that every distribution-free statistic, symmetric in $X_1, X_2, \ldots, X_n$,
with $\Omega = \Omega' = \Omega_2$, has structure (d) is not true. This can be
seen from the following counter-example:

Let $\omega_1$ and $\omega_2$ be non-empty, mutually exclusive subsets of $\Omega_2$
such that $\omega_1 \cup \omega_2 = \Omega_2$. Denoting by $F_n$ again the empirical
cumulative distribution function determined by a sample of size n, we define

$S = \sup_{-\infty < x < \infty} [F(x) - F_n(x)] = S_1$,  if $F \in \omega_1$,

$S = \sup_{-\infty < x < \infty} [F_n(x) - F(x)] = S_2$,  if $F \in \omega_2$.

Since $S_1$ and $S_2$ are distribution-free statistics with the same
probability distribution, S is a distribution-free statistic. It is, however,
clearly not a statistic of structure (d).
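A small numerical sketch shows why no single function of the values F(X_i) can represent S. The function names and the boolean flag standing in for membership in the first subset are assumptions made for illustration.

```python
def sup_F_minus_Fn(u):
    """S_1 = sup [F(x) - F_n(x)], expressed through u_i = F(X_i)."""
    n = len(u)
    return max(v - (i - 1) / n for i, v in enumerate(sorted(u), start=1))


def sup_Fn_minus_F(u):
    """S_2 = sup [F_n(x) - F(x)], expressed through u_i = F(X_i)."""
    n = len(u)
    return max(i / n - v for i, v in enumerate(sorted(u), start=1))


def S(u, F_in_omega_1):
    """The counter-example: which formula applies depends on the class of F."""
    return sup_F_minus_Fn(u) if F_in_omega_1 else sup_Fn_minus_F(u)


u = [0.1, 0.2]
value_if_omega_1 = S(u, True)
value_if_omega_2 = S(u, False)

# Reflecting u_i -> 1 - u_i maps S_1 to S_2, which is why the two one-sided
# statistics have the same probability distribution under the rectangular law.
reflected = [1.0 - v for v in u]
```

For the same point of the unit cube the two branches give different values, so the value of S is not determined by F(X_1), ..., F(X_n) alone.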


3. Strongly distribution-free statistics.

Let $\Omega^*$ be the family of all continuous cumulative probability
functions such that if $G \in \Omega^*$ then G is strictly increasing at all x
for which $0 < G(x) < 1$. Clearly if $G \in \Omega^*$ then the inverse function
$G^{(-1)}$ is defined on the open unit interval.

We now consider a statistic $S(X_1, X_2, \ldots, X_n, G)$ in $\Omega^*$ with regard
to some family $\Omega'$ of cumulative probability functions. This statistic
shall be called strongly distribution-free in $\Omega^*$ with regard to $\Omega'$
if the probability distribution $\mathcal{P}[S(X_1, X_2, \ldots, X_n, G); F]$ depends only on
the function $\tau = F G^{(-1)}$ for all $G \in \Omega^*$, $F \in \Omega'$.

It is easily seen that, for $\Omega' = \Omega^*$, a strongly distribution-free
statistic is distribution-free. For if $\mathcal{P}[S(X_1, X_2, \ldots, X_n, G); F]$ depends
only on $F G^{(-1)}$ for all $F, G \in \Omega^*$, then in particular $\mathcal{P}[S(X_1, X_2, \ldots, X_n, G); G]$
depends only on $G G^{(-1)} = I$, hence is independent of G. One also verifies
immediately that if a statistic in $\Omega^*$ with regard to $\Omega_2$ has structure (d)
then it is strongly distribution-free, since then, with $Y_i = G(X_i)$,

$\mathcal{P}\{\Phi[G(X_1), G(X_2), \ldots, G(X_n)]; F\} = \mathcal{P}\{\Phi[Y_1, Y_2, \ldots, Y_n]; F G^{(-1)}\}$.
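The function F G^(-1) is the cumulative probability function of Y = G(X) when X has the cumulative probability function F. A minimal sketch, assuming two exponential laws chosen only for illustration (rate 1 for F, rate 2 for G):

```python
import math


def F(x):
    """Assumed true cdf: exponential with rate 1."""
    return 1.0 - math.exp(-x)


def G(x):
    """Assumed hypothesized cdf: exponential with rate 2."""
    return 1.0 - math.exp(-2.0 * x)


def G_inv(y):
    """Inverse of G on the open unit interval."""
    return -math.log(1.0 - y) / 2.0


def tau(y):
    """tau = F G^(-1): the cdf of Y = G(X) when X has the cdf F."""
    return F(G_inv(y))


def tau_closed_form(y):
    """For this pair of exponentials, F(G^(-1)(y)) simplifies to 1 - sqrt(1 - y)."""
    return 1.0 - math.sqrt(1.0 - y)


grid = [0.1, 0.25, 0.5, 0.75, 0.9]
values = [tau(y) for y in grid]
```

When F = G the composition reduces to the identity on the unit interval, which is the special case used above to show that a strongly distribution-free statistic is distribution-free.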

Since all practically important distribution-free statistics are
symmetric in $X_1, X_2, \ldots, X_n$ and strongly distribution-free, as well as of
structure (d), one again may conjecture that under some fairly general
assumptions these two properties are equivalent. This conjecture is found
to be correct for $\Omega' = \Omega^*$. We have already seen that if a statistic
has structure (d) it is strongly distribution-free; it remains only to
prove the converse statement:


Theorem. If a statistic $W = S(X_1, X_2, \ldots, X_n, G)$ in $\Omega^*$ with regard
to $\Omega^*$ is symmetric in $X_1, X_2, \ldots, X_n$ and strongly distribution-free,
then it has structure (d).

The proof of this theorem makes use of a lemma which will be presented
in the next section.


4. A lemma.

Let H be a strictly increasing continuous function on the
closed unit-interval, such that $H(0) = 0$, $H(1) = 1$; $\mu_H$ the measure
defined by H on the unit-interval $I_1$; $\mu_H^{(n)}$ the corresponding
product-measure on the n-dimensional unit-cube $I_n$. Then, for any set $M \subset I_n$
with $\mu_H^{(n)}(M) > 0$ and any $\varepsilon > 0$, there exist sets $Q_1, Q_2, \ldots, Q_n$ in
$I_1$ such that

1° $Q_1, Q_2, \ldots, Q_n$ are disjoint, $\mu_H$-measurable, and
$\mu_H(Q_i) > 0$, $i = 1, 2, \ldots, n$,

2° for $Q_0 = \mathrm{Compl.} \bigcup_{i=1}^{n} Q_i$ we have $\mu_H(Q_0) > 0$,

3° if $Q_i$ is placed on the $y_i$-axis, $i = 1, 2, \ldots, n$, then the
product-set $Q = Q_1 \times Q_2 \times \cdots \times Q_n$ in $I_n$ has the property

$\frac{\mu_H^{(n)}(Q \cap M)}{\mu_H^{(n)}(Q)} > 1 - \varepsilon$.


Proof: It may be assumed without loss of generality that $H(y) = y$,
so that $\mu_H$ and $\mu_H^{(n)}$ are Lebesgue measures. Let $C_{\delta; y_1, \ldots, y_n}$ denote
the cube $|Y_i - y_i| < \delta$, $i = 1, 2, \ldots, n$, in the $(Y_1, \ldots, Y_n)$ space, with the center
$(y_1, y_2, \ldots, y_n)$ and the volume $\mu_H^{(n)}(C_{\delta; y_1, \ldots, y_n}) = (2\delta)^n$.
It is well known that

(4.1)  $\lim_{\delta \to 0} \frac{\mu_H^{(n)}(M \cap C_{\delta; y_1, \ldots, y_n})}{(2\delta)^n} = 1$

for almost all points $(y_1, \ldots, y_n)$ in M (see e.g. [2], p. 129). The subset of those
points of M for which no two coordinates are equal and none is 0 or 1 has
the same measure as M. Let $M_1$ be the set of all points of M for which (4.1)
holds and which have no two coordinates equal and no coordinate 0 or 1.
Then $\mu_H^{(n)}(M_1) = \mu_H^{(n)}(M) > 0$. Let $(y_1^0, \ldots, y_n^0)$ be a point in $M_1$, and let

$\Delta = \min\{ \min_i y_i^0,\ \min_i (1 - y_i^0),\ \tfrac{1}{2} \min_{i \ne j} |y_i^0 - y_j^0| \}$.

Clearly $\Delta > 0$, and for $0 < \delta < \Delta$ the intervals

(4.2)  $Q_i = (y_i^0 - \delta,\ y_i^0 + \delta)$,  $i = 1, 2, \ldots, n$,

are all in $I_1$ and satisfy 1° and 2°. If $Q_i$ is placed on the $Y_i$-axis
then the product-set $Q = Q_1 \times Q_2 \times \cdots \times Q_n$ is the cube $C_{\delta; y_1^0, \ldots, y_n^0}$.
According to (4.1) there exists a $\delta_0 > 0$ such that

$\frac{\mu_H^{(n)}(M \cap C_{\delta; y_1^0, \ldots, y_n^0})}{(2\delta)^n} > 1 - \varepsilon$

for $\delta < \delta_0$. Choosing $\delta < \min(\Delta, \delta_0)$ and constructing the intervals
(4.2) one obtains the sets $Q_1, Q_2, \ldots, Q_n$ required by the Lemma.
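The interval construction of the proof can be sketched as follows for the case H(y) = y. The function name, the default choice of the half-width, and the factor 1/2 applied to the smallest gap between coordinates (used here so that the intervals cannot overlap) are assumptions of this illustration; the density condition (4.1) is not checked.

```python
def lemma_intervals(y0, delta=None):
    """Build Q_i = (y_i - delta, y_i + delta) around a point of the open unit cube.

    y0 must have pairwise distinct coordinates strictly between 0 and 1.
    """
    n = len(y0)
    gaps = [abs(y0[i] - y0[j]) for i in range(n) for j in range(i + 1, n)]
    candidates = [min(y0), min(1.0 - y for y in y0)]
    if gaps:
        candidates.append(0.5 * min(gaps))
    big_delta = min(candidates)
    if big_delta <= 0:
        raise ValueError("coordinates must be distinct and strictly inside (0, 1)")
    if delta is None:
        delta = 0.5 * big_delta  # any value in (0, big_delta) would do
    if not 0 < delta < big_delta:
        raise ValueError("delta must satisfy 0 < delta < big_delta")
    return [(y - delta, y + delta) for y in y0]


intervals = lemma_intervals([0.2, 0.5, 0.9])
```

With this bound the intervals are disjoint, lie inside the unit interval, and leave a complement of positive length, which corresponds to properties 1° and 2° of the Lemma.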


5. Proof of Theorem.

When the random variable X has the cumulative probability function F,
the random variable $Y = G(X)$ has the cumulative probability function
$H = F G^{(-1)}$. Setting $Y_i = G(X_i)$ we, therefore, have

$W = S(X_1, \ldots, X_n, G) = S[G^{(-1)}(Y_1), \ldots, G^{(-1)}(Y_n), G]$

and

$\mathcal{P}[S(X_1, \ldots, X_n, G); F] = \mathcal{P}\{S[G^{(-1)}(Y_1), \ldots, G^{(-1)}(Y_n), G]; F G^{(-1)}\}$

$= \mathcal{P}\{S[G^{(-1)}(Y_1), \ldots, G^{(-1)}(Y_n), G]; H\}$.

By assumption, this last probability distribution depends only on the
cumulative probability function H, and not on G. From this and the
symmetry assumption we wish to conclude that $S[G^{(-1)}(Y_1), \ldots, G^{(-1)}(Y_n), G]$
can be written in the form of a function $\Phi(Y_1, \ldots, Y_n)$, independent of
G except on a set of H-measure zero.

To prove this, we assume that for some $G_1, G_2 \in \Omega^*$ we have

$S[G_1^{(-1)}(Y_1), \ldots, G_1^{(-1)}(Y_n), G_1] \ne S[G_2^{(-1)}(Y_1), \ldots, G_2^{(-1)}(Y_n), G_2]$

on a set of positive H-measure. Without loss of generality we may assume

(5.1)  $\infty > k > S[G_1^{(-1)}(Y_1), \ldots, G_1^{(-1)}(Y_n), G_1] - S[G_2^{(-1)}(Y_1), \ldots, G_2^{(-1)}(Y_n), G_2] > h > 0$

on a set M in the unit cube $I_n$, where M is symmetric and has positive
measure. For any H, continuous and strictly increasing in $I_1$, and any
$\varepsilon > 0$, we construct sets $Q_1, Q_2, \ldots, Q_n$ according to the Lemma in
Section 4 and have

(5.2)  $\frac{\mu_H^{(n)}(Q \cap M)}{\mu_H^{(n)}(Q)} > 1 - \varepsilon$.


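For any

(5.3)  $\sigma_i > 0$,  $i = 0, 1, \ldots, n$,

we define the set function

$\mu_{\sigma_0, \ldots, \sigma_n}(T) = \frac{\sum_{j=0}^{n} \sigma_j \mu_H(T \cap Q_j)}{\sum_{j=0}^{n} \sigma_j \mu_H(Q_j)}$

for any measurable $T \subset I_1$. This clearly is a probability measure in $I_1$.
Taking for T the interval (0, y) we obtain a strictly increasing continuous
cumulative probability function which will be denoted by $H_{\sigma_0, \ldots, \sigma_n}$.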
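For H(y) = y and interval-shaped sets Q_1, ..., Q_n, the reweighted cumulative probability function can be computed directly. The sketch below is illustrative: the function names, the argument layout (weights[0] belongs to the complement Q_0), and the sample intervals are assumptions, and the intervals are taken to be disjoint subintervals of the unit interval.

```python
def overlap(a, b, lo, hi):
    """Length of the intersection of the intervals (a, b) and (lo, hi)."""
    return max(0.0, min(b, hi) - max(a, lo))


def h_sigma(y, intervals, weights):
    """Reweighted cdf at y in [0, 1] for H(y) = y.

    intervals: disjoint subintervals Q_1, ..., Q_n of (0, 1).
    weights:   positive numbers sigma_0, ..., sigma_n; weights[0] applies to Q_0,
               the complement of the union of the Q_i.
    """
    lengths = [b - a for a, b in intervals]
    q0_length = 1.0 - sum(lengths)
    denom = weights[0] * q0_length + sum(s * l for s, l in zip(weights[1:], lengths))
    covered = [overlap(a, b, 0.0, y) for a, b in intervals]
    q0_part = y - sum(covered)  # part of (0, y) lying in Q_0
    numer = weights[0] * q0_part + sum(s * c for s, c in zip(weights[1:], covered))
    return numer / denom


qs = [(0.15, 0.25), (0.45, 0.55)]
grid = [k / 20 for k in range(21)]
cdf_values = [h_sigma(y, qs, [1.0, 2.0, 3.0]) for y in grid]
```

With all weights equal the function reduces to H itself; with unequal positive weights it remains continuous and strictly increasing, with slope proportional to the weight of the set containing y.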

Without loss of generality, S may be assumed bounded, since otherwise
we could consider a bounded, strictly increasing function of S such as
$S / (1 + |S|)$. This assures the existence of the mathematical
expectation of S. Since $S[G_1^{(-1)}(Y_1), \ldots, G_1^{(-1)}(Y_n), G_1]$ and
$S[G_2^{(-1)}(Y_1), \ldots, G_2^{(-1)}(Y_n), G_2]$ have the same probability distribution if
$Y_1, \ldots, Y_n$ is a sample of a random variable Y with the cumulative
probability function $H_{\sigma_0, \ldots, \sigma_n}$, their mathematical expectations are
equal:

(5.4)  $E\{S[G_1^{(-1)}(Y_1), \ldots, G_1^{(-1)}(Y_n), G_1] - S[G_2^{(-1)}(Y_1), \ldots, G_2^{(-1)}(Y_n), G_2]\} = 0$.

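Using the abbreviations

$S_i(Y_1, \ldots, Y_n) = S[G_i^{(-1)}(Y_1), \ldots, G_i^{(-1)}(Y_n), G_i]$,  $i = 1, 2$,

we write the left-hand side of (5.4) explicitly

$\int \cdots \int_{I_n} [S_1(Y_1, \ldots, Y_n) - S_2(Y_1, \ldots, Y_n)] \prod_{i=1}^{n} dH_{\sigma_0, \ldots, \sigma_n}(Y_i)$

(5.5)  $= \sum_{j_1=0}^{n} \cdots \sum_{j_n=0}^{n} \int_{Q_{j_1}} \cdots \int_{Q_{j_n}} [S_1(Y_1, \ldots, Y_n) - S_2(Y_1, \ldots, Y_n)] \prod_{i=1}^{n} dH_{\sigma_0, \ldots, \sigma_n}(Y_i)$

$= \sum_{j_1=0}^{n} \cdots \sum_{j_n=0}^{n} \frac{\sigma_{j_1} \cdots \sigma_{j_n}}{[\sum_{j=0}^{n} \sigma_j \mu_H(Q_j)]^n} \int_{Q_{j_1}} \cdots \int_{Q_{j_n}} [S_1(Y_1, \ldots, Y_n) - S_2(Y_1, \ldots, Y_n)] \, dH(Y_1) \cdots dH(Y_n)$.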

Since $S_1(Y_1, \ldots, Y_n)$, $S_2(Y_1, \ldots, Y_n)$ and M are symmetric in $Y_1, \ldots, Y_n$,
all the terms of the sum which correspond to different permutations of the
same n subscripts (out of the n+1 possible values 0, 1, ..., n)
are equal. Collecting these equal terms, we obtain a polynomial in
$\sigma_0, \sigma_1, \ldots, \sigma_n$, which according to (5.4) vanishes identically under the
restrictions (5.3). It follows that each of the integrals in the last
term of (5.5) must vanish; and in particular

$\int_{Q_1} \int_{Q_2} \cdots \int_{Q_n} [S_1(Y_1, \ldots, Y_n) - S_2(Y_1, \ldots, Y_n)] \, dH(Y_1) \cdots dH(Y_n) = 0$,

which, for $\varepsilon$ sufficiently small, contradicts (5.1) and (5.2).
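The final contradiction can be spelled out as a short estimate. In this sketch h denotes the positive lower bound of the difference on M in (5.1), and B denotes an upper bound for $|S_1 - S_2|$ on the whole unit cube; B is a symbol introduced here for illustration, and it exists because S is assumed bounded.

```latex
\begin{align*}
\int_{Q} (S_1 - S_2)\, d\mu_H^{(n)}
  &= \int_{Q \cap M} (S_1 - S_2)\, d\mu_H^{(n)}
   + \int_{Q \setminus M} (S_1 - S_2)\, d\mu_H^{(n)} \\
  &\ge h\, \mu_H^{(n)}(Q \cap M) - B\, \mu_H^{(n)}(Q \setminus M) \\
  &> \bigl[\, h(1 - \varepsilon) - B\varepsilon \,\bigr]\, \mu_H^{(n)}(Q) .
\end{align*}
% The last step uses (5.2): mu(Q \cap M) > (1 - eps) mu(Q), hence mu(Q \setminus M) < eps mu(Q).
% The bracket is positive as soon as eps < h / (h + B), so the integral over Q cannot vanish.
```

Under these assumptions any $\varepsilon < h/(h+B)$ makes the integral over Q strictly positive, which is incompatible with its vanishing.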